Method for aggregating data in a network

ABSTRACT

A method for aggregating data in a network, particularly in a wireless sensor network, wherein the network (1) includes a plurality of sensor nodes (N_(i)) to measure data and at least one sink node (S) at which the data measured by the sensor nodes (N_(i)) are aggregated, and wherein each sensor node (N_(i)) encrypts its measured data with a key k_(i) and forwards the result towards the sink node (S), is characterized in that, in the context of a key distribution within the network (1), a master key K is chosen, and that the master key K is autonomously split up by the network (1) into individual keys k_(i) to be used by the sensor nodes (N_(i)) for encrypting measured data, with the sum of all individual keys k_(i) being equal to the master key K.

The present invention relates to a method for aggregating data in a network, particularly in a wireless sensor network, wherein the network comprises a plurality of sensor nodes to measure data and at least one sink node at which the data measured by the sensor nodes are aggregated, and wherein each sensor node encrypts its measured data with a key k_(i) and forwards the result towards the sink node.

Methods as mentioned above are well known in practice and are of special importance in the context of wireless sensor networks (WSNs). WSNs are ad-hoc networks composed, in general, of miniaturized sensor nodes with limited computation and energy capacities. A sensor node generally consists of a probe, a processing unit, a communication device, and a battery, and provides the functions of data acquisition, communication and computation.

Wireless sensor networks are becoming increasingly popular in many spheres of life. The field of application of sensor networks includes but is by no means limited to monitoring and controlling machines, controlling (intra- and extra-corporal) health parameters or environmental monitoring (like temperature, humidity and seismic activity measurement). The range of application possibilities is almost infinite, though. In specific application fields, such as water contamination examination or weather forecasting, for example, it is extremely advantageous that the sensor nodes may be manufactured in miniature size making it possible to establish sensor networks even in geographical regions that are hard to access.

The communication among the sensor nodes is performed wirelessly via broadcast messages. This makes sending a highly costly and unreliable operation which should be avoided as much as possible. As a rule of thumb, sending is roughly 100-1000 times more energy consuming than one computation step. Therefore, any mechanism that reduces the amount of transmission helps to extend the lifetime of a WSN.

One possible approach for transmission reduction is data aggregation. In certain fields of operation, the user is not interested in the individual measurements but rather in their aggregation, e.g. their sum or the average value. In these cases, it is not necessary to send all measurements from the sensor nodes up to the sink node. Instead, these data can be processed within the network and only the result, e.g. the sum, is forwarded. As the data size of the sum is much smaller than the concatenation of all the summands, this helps to lower the transmission overhead.

As sensor nodes, due to their limited capacities, in general do not comprise a tamper-resistant unit, security issues regarding concealed communication among the nodes are crucial. Particularly in WSNs with a massive number of sensor nodes intended for deployment in a volatile, untrusted or even hostile environment, nodes must be equipped with security mechanisms which support message exchange between the nodes.

For non-tamper-resistant sensor nodes, which are the most likely choice for future large-scale sensor networks, hop-by-hop security as provided by proprietary solutions like TinySec (the security module of TinyOS) or by a standardised security protocol like IEEE 802.15.4 can only provide a very limited overall system security. Stimulated by this observation, efficient solutions should aim at mechanisms that provide end-to-end security. In this context, end-to-end security means that only the sink node, but no node in between, can access the data that was individually encrypted by the different sensor nodes. In the best case, it should be possible to combine end-to-end encryption with data aggregation. Although providing data aggregation with end-to-end security in WSNs is a challenging task due to the highly self-organising distributed structure of WSNs, some progress in this direction has already been made.

For example, in C. Castelluccia, E. Mykletun, and G. Tsudik, Efficient aggregation of encrypted data in wireless sensor networks, in 2nd Annual International Conference on Mobile and Ubiquitous Systems Networking and Services, San Diego, Calif., USA, July 2005, the focus is on end-to-end encryption of real-time traffic in synchronous WSNs and a stream cipher-based symmetric encryption transformation is proposed. By using pairwise keys, the scheme provides a high system security. Also, the pre-distribution of keys is quite simple as the keys can be randomly distributed to the nodes and only the sink node needs to store all the keys. However, the benefit of the overall system security comes at the cost of additional overhead. Firstly, the nodes' configuration before node deployment requires a pairwise pairing between each sensor node and the sink node to agree on the shared key. Secondly, although the sink node is considered to be equipped with much more memory than a sensor node, if one is aiming at large-scale sensor networks, the storage of thousands or hundreds of thousands of pairs (ID, key) may turn out to be a serious problem. Thirdly, since the focus is on security solutions over a highly unreliable medium, one cannot ignore the impact of packet loss on the wireless broadcast medium. Revealing, for each data transmission, the key IDs or node IDs of all the currently involved nodes therefore becomes mandatory for converge-cast traffic encrypted with the privacy homomorphic encryption scheme by Castelluccia, Mykletun and Tsudik. This enormously increases the size of the transmission data that has to be sent by the nodes. As sending is by far the most costly operation in terms of energy consumption, this approach reduces the lifetime of a WSN and therefore decreases the users' acceptance.

It is therefore an object of the present invention to improve and further develop a method of the initially described type for aggregating data in a network in such a way that, by employing mechanisms that are easy to implement, a high level of security is provided for the data aggregation process, while the communication overhead is reduced compared to the schemes according to the state of the art.

In accordance with the invention, the aforementioned object is accomplished by a method comprising the features of claim 1. According to this claim such a method is characterized in that, in the context of a key distribution within the network, a master key K is chosen, and that the master key K is autonomously split up by the network into individual keys k_(i) to be used by the sensor nodes for encrypting measured data, with the sum of all individual keys k_(i) being equal to the master key K.

According to the invention it has first been recognised that a high security level with a reduced communication overhead can be realised by implementing a key distribution process within a network that is based on a master key K that is autonomously split up by the network into individual keys k_(i) to be used by the sensor nodes for encrypting measured data. The autonomous splitting up of the master key K is performed in such a way that the sum of all individual keys k_(i) is equal to the master key K. By this means a self-organised key distribution is realized, making this process more convenient for the network's owner/administrator since the nodes' configuration before node deployment does not require a pairwise pairing between each sensor node and the sink node to agree on a shared key. Furthermore, the method according to the invention does not require additional data overhead when sending encrypted traffic towards the sink node. Moreover, by employing pairwise unique keys for end-to-end encryption a high overall system security is achieved since there is no lack of security at single sensor nodes.

Advantageously, the master key K is chosen by the sink node. The sink node, in general, is equipped with much more power and storage capacity than a sensor node. This factor, together with its central role within the network, predestines the sink node for initiating the key distribution process. As regards a further improvement of security it proves to be advantageous that the master key K is solely stored at the sink node. In this case only the sink node is enabled to decrypt the final aggregated value by applying the master key K to the received ciphers. It is pointed out that the sink node knows the sum K of the keys stored in the sensor network without knowing the individual keys k_(i).

Preferably, the flow of data in the network has a converge-cast traffic structure. This kind of structure that is directed towards a central point—the sink node—is best suited for a key distribution scheme that supports the establishment of pairwise unique keys for end-to-end encryption.

Concretely, each sensor node of the network may have one predecessor node (that is closer to the sink node than the sensor node itself). The only exception may be the sink node which—as the destination point of the converge-cast traffic structure—does not have any predecessor node. Accordingly, looking away from the centre of the network, the sensor nodes may have one or more successor nodes. In this context it becomes clear to someone skilled in the art, that not every sensor node will have a successor node. For example, the outermost sensor nodes of a network do not have a successor node. Such nodes could be denominated as leaf nodes, thereby referring to the tree or tree like structure of the sensor network.

In an especially advantageous way, means are provided by which each sensor node is informed of its relative position within the network. In other words, after an initial bootstrap phase each sensor node knows its predecessor node and its successor nodes. In this regard it is to be preferred that the sensor nodes in the network are quasi-stationary once they have been distributed. It is to be noted that a sensor node's knowledge of its relative position does not include location-awareness with respect to its absolute position.

Based on the relative position information it may be provided that each sensor node, in the context of the key distribution process, upon receipt of a certain subtotal K_(i) of the master key K from its predecessor node splits up the value K_(i) into a number n of subtotals K′_(i). As regards an optimal adaptation to the network's structure each sensor node may define the number n of subtotals K′_(i) in accordance with the number of its direct successor nodes. For example, if a sensor node knows that it has three successor nodes and if this sensor node receives the value K_(i) from its predecessor node, it splits up this value into four subtotals with K_(i)=K_(i)*+K′₁+K′₂+K′₃. K_(i)* will be used by the sensor node later on to encrypt its data. The other subtotals are passed to the successor nodes to be split up again according to the network's structure and so on. From a security point of view it is advantageous that the sensor node itself only stores K_(i)* permanently and deletes the other subtotals from its memory after passing them to the appropriate successors. At the end of the key distribution process, each sensor node possesses one key such that the sum of all the keys distributed within the network equals the master key K.
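The splitting step performed by a single sensor node may be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: keys are assumed to be integers, the sum is assumed to be taken modulo a public modulus (the description only requires that the subtotals add up to K_(i)), and the names split_share and MODULUS are hypothetical.

```python
import secrets

MODULUS = 2**32  # assumed public modulus; the description does not fix one


def split_share(k_i, num_successors, modulus=MODULUS):
    """Split the received subtotal k_i into the node's own key plus one
    subtotal per direct successor node.

    Returns (own_key, successor_subtotals) such that
    (own_key + sum(successor_subtotals)) % modulus == k_i % modulus.
    """
    # One random subtotal for each direct successor node.
    successor_subtotals = [secrets.randbelow(modulus) for _ in range(num_successors)]
    # The node's own key is the remainder, so that all parts add up to k_i.
    own_key = (k_i - sum(successor_subtotals)) % modulus
    return own_key, successor_subtotals


# A node with three successor nodes splits its subtotal into four parts.
own_key, subtotals = split_share(123456, 3)
```

In this sketch the node would keep only own_key and discard the entries of subtotals once they have been passed on to the respective successor nodes.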

In a preferred embodiment at least some of the sensor nodes may function as aggregator nodes by aggregating the measured data of their successor nodes. By this means the number of messages needed to transport the measured data of all sensor nodes to the sink node can be reduced drastically. As sending is the most costly operation in wireless sensor networks, data aggregation is highly favourable.

The aggregation process carried out by an aggregator node may include the steps of performing an aggregation function on incoming encrypted data, which is forwarded to the aggregator node by its successor nodes, as well as on its own measured encrypted data, and the step of forwarding the aggregated result towards the sink node. Concretely, the aggregation function may be an addition operation, a subtraction operation, a multiplication operation or an inverse multiplication operation.

In an especially advantageous embodiment, the algorithm used for encryption is homomorphic both with respect to the keys k_(i) used for encryption and with respect to the values v_(i) to be encrypted. This kind of encryption function is called bihomomorphic. This means that for any two keys k₁ and k₂ as well as for any two plaintext values v₁ and v₂, it holds that

E_(k1)(v₁)+E_(k2)(v₂)=E_(k1+k2)(v₁+v₂).

A simple example of such an algorithm is E_(k)(v):=(k+v) mod n, where both key k and value v are integers in the range from 0 to n−1. If the aggregation operation is the addition operation, the sink node wants to know the sum of the measured values v₁+ . . . +v_(n), where n here denotes the number of sensor nodes in the network (not the modulus of the preceding example). Due to the bihomomorphic structure of the encryption algorithm, the sink node which wants to decrypt

E_(k1)(v₁)+ . . . +E_(kn)(v_(n))=E_(k1+ . . . +kn)(v₁+ . . . +v_(n))

only needs to know the value k₁+ . . . +k_(n), i.e. the master key K. In contrast, the individual keys k_(i) are not needed for the decryption. Since the described method aims at large-scale sensor networks, the storage of thousands of pairs (ID, key), which may turn out to be a serious problem, is dispensable. The sink node or any other node that initiates the key distribution process only needs to know the master key K with K=k₁+ . . . +k_(n). The aggregator nodes, on the other hand, are not required to perform any decryption operation on the incoming data from their successor nodes, as is required when using conventional hop-by-hop encryption. As previously pointed out, this increases the overall system security since there is no lack of security at the aggregator nodes.
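The bihomomorphic property and the decryption with the master key alone may be illustrated with the simple additive example given above. The following is a minimal sketch; the modulus value, the sample keys and values, and the function names are assumptions chosen for illustration. The sum of the plaintext values is assumed to stay below the modulus, since otherwise it would wrap around.

```python
MODULUS = 1000  # assumed modulus of the example cipher E_k(v) = (k + v) mod MODULUS


def encrypt(k, v, m=MODULUS):
    """Additive example cipher: E_k(v) = (k + v) mod m."""
    return (k + v) % m


def decrypt(k, c, m=MODULUS):
    """Inverse of the example cipher: v = (c - k) mod m."""
    return (c - k) % m


keys = [417, 802, 95]    # individual keys k_1..k_3 (illustrative values)
values = [20, 31, 27]    # measured values v_1..v_3 (illustrative values)

# Each sensor node encrypts its own value with its own key.
ciphers = [encrypt(k, v) for k, v in zip(keys, values)]

# Aggregation adds ciphertexts only; no decryption takes place on the way.
aggregate = sum(ciphers) % MODULUS

# The sink node knows only the master key K = k_1 + k_2 + k_3.
master_key = sum(keys) % MODULUS
total = decrypt(master_key, aggregate)  # 78 == 20 + 31 + 27
```

The individual keys appear in the sketch only to produce the ciphertexts; the final decryption uses master_key and nothing else.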

The bihomomorphic algorithm may be deterministic or probabilistic. A deterministic one maps the same pair of key and plaintext always to the same ciphertext, whereas for a probabilistic one this mapping differs with some probability.

In a further preferred embodiment the sink node chooses a random value R as a network-wide dummy value which may be distributed within the sensor network in the context of the key distribution process. The value R is employed in such a way that each sensor node N_(i) stores for each of its successor nodes the value E_(ki)(R), i.e. the ciphertext of the random value R, wherein k_(i) denotes the key of the respective successor node. With regard to a highly efficient data aggregation process, each aggregator node, in the context of applying the aggregation function, employs the default value E_(ki)(R) for those of its successor nodes which do not contribute to the data aggregation process by forwarding a measured value to the aggregator node. By this means non-responding nodes (which, for example, ran out of energy or dropped out due to other failures) can be handled and, as opposed to schemes according to the state of the art, the IDs of responding/non-responding nodes do not need to be provided. Thus, only the aggregated ciphertexts need to be forwarded. This again reduces the communication overhead and thereby increases the lifetime of the sensor network. Furthermore, the robustness during the aggregation phase is enhanced, which is especially useful due to the highly unreliable medium in which the messages are broadcast and in which the impact of packet loss cannot be ignored. Consequently, even in cases of unreliable links or in situations in which a subset of nodes cannot participate in the sensing and forwarding process, the data aggregation can be performed with high efficiency and security. This is particularly effective for sensor networks with a fixed network structure, as is the case for wireless body area networks (W-BANs). W-BANs will probably play a prominent role in future e-health architectures as they allow the remote monitoring of customers' health status.
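The substitution of default ciphers at an aggregator node may be sketched as follows. The sketch assumes the simple additive cipher E_(k)(v)=(k+v) mod n from the example above; the function names, the node labels, the dictionary-based bookkeeping and all numeric values are hypothetical.

```python
MODULUS = 1000  # assumed modulus of the additive example cipher


def encrypt(k, v, m=MODULUS):
    return (k + v) % m


def aggregate(own_cipher, default_ciphers, received, m=MODULUS):
    """Add the node's own cipher to the ciphers of its successor nodes.

    default_ciphers maps each successor node to its stored value E_k(R);
    received maps the responding successor nodes to the ciphers they sent.
    Returns the aggregated cipher and the number of substituted defaults.
    """
    total = own_cipher
    substituted = 0
    for successor, default in default_ciphers.items():
        if successor in received:
            total += received[successor]
        else:
            # Non-responding successor: use the stored default cipher instead.
            total += default
            substituted += 1
    return total % m, substituted


R = 7  # network-wide dummy value (illustrative)
defaults = {"N1": encrypt(100, R), "N2": encrypt(200, R), "N3": encrypt(300, R)}
received = {"N1": encrypt(100, 11), "N3": encrypt(300, 13)}  # N2 stays silent
result = aggregate(encrypt(50, 5), defaults, received)
```

Note that the aggregator never needs the successor keys at aggregation time; it only needs the stored default ciphers and the knowledge of which direct successors failed to respond.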

Advantageously, the key distribution is carried out in a secure environment where no attacker is assumed to be in place. After the initial key distribution in the context of an initialisation phase, refreshment phases may be provided in which each sensor node's key k_(i) is updated from time to time. In particular, for any deterministic privacy homomorphism used for encryption it is essential to refresh keys, ideally after each single aggregation phase. However, for a realistic setting a good balance between the frequency of key refreshment and the required security level should be achieved. It has to be noted that the key refreshment is only useful to support a deterministic privacy homomorphism. For all encryption transformations which fall into the category of pairwise probabilistic privacy homomorphism it does not yield any benefits.

In principle, the refreshment phase of the keys may be similar to the key distribution in the initialisation phase. The keys k_(i) may be modified such that the old master key K is replaced by a new master key K* such that the sink node only chooses and knows the difference K−K*=:Δ, but not the individual keys.

Two different embodiments for the key refreshment prove to be advantageous. According to a first embodiment, the difference Δ is split up and reported to the successor nodes in the same way as the master key K in the initialisation phase. Since only shares of the difference Δ are transmitted, an eavesdropper who does not know the old keys cannot derive the updated keys. Unlike the initialisation phase, where no attacker is assumed to be in place, the key refreshment phases therefore do not need a secure environment.
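The first refreshment embodiment may be sketched as follows. For brevity the sketch splits the difference Δ in one flat step rather than hop by hop along the tree, and it assumes integer keys added modulo a public modulus; the helper name split and all values are hypothetical. With Δ defined as K−K*, each node subtracts its share of Δ from its old key.

```python
import secrets

MODULUS = 2**16  # assumed public modulus


def split(total, parts, m=MODULUS):
    """Split total into `parts` random shares that add up to total mod m."""
    shares = [secrets.randbelow(m) for _ in range(parts - 1)]
    shares.append((total - sum(shares)) % m)
    return shares


old_keys = [11, 22, 33, 44]              # current keys k_i (illustrative)
old_master = sum(old_keys) % MODULUS     # K

delta = secrets.randbelow(MODULUS)       # chosen by the sink node: delta = K - K*
delta_shares = split(delta, len(old_keys))

# Each node updates its key with its own share of the difference.
new_keys = [(k - d) % MODULUS for k, d in zip(old_keys, delta_shares)]

# The sink node updates the master key without learning any individual key.
new_master = (old_master - delta) % MODULUS
```

After the update the new keys again add up to the new master key, so the aggregation and decryption steps remain unchanged.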

Alternatively, each node stores several keys at the same time, derived from different master keys. In this case, the actual key is equal to a specific linear combination of the stored keys, wherein the linear combination is uniquely determined. As opposed to the first embodiment, the second one does not require the transmission of data for the key refreshment.
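Why a linear combination of stored keys keeps the scheme consistent may be illustrated as follows. The description does not state how the linear combination is determined; the sketch simply assumes that all nodes and the sink node agree on the same integer coefficients for a given round. The modulus, the coefficients, the function name combine and the key values are hypothetical.

```python
MODULUS = 10007  # assumed public modulus


def combine(stored_keys, coefficients, m=MODULUS):
    """Linear combination of stored keys with the given coefficients, mod m."""
    return sum(c * k for c, k in zip(coefficients, stored_keys)) % m


# Each node stores two keys, one share of each of two master keys.
node_keys = [(12, 400), (75, 9000), (3001, 58)]

# The sink node knows the two master keys (column sums of the shares).
master_keys = tuple(sum(column) % MODULUS for column in zip(*node_keys))

coefficients = (3, 5)  # assumed to be agreed network-wide for this round

actual_keys = [combine(keys, coefficients) for keys in node_keys]
actual_master = combine(master_keys, coefficients)
```

Because the combination is linear, the actual keys of all nodes again add up to the same combination of the master keys, so no key material needs to be transmitted when the coefficients change.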

In a further advantageous embodiment the roles of the sensor nodes are changed from time to time. Such a change results in a longer lifetime of the sensor network as, for example, aggregating incoming data and forwarding them is a more energy consuming task than just sensing and sending the measured values.

There are several ways to design and further develop the teaching of the present invention in an advantageous way. To this end, reference is made to the patent claims subordinate to patent claim 1 on the one hand, and to the following explanation of a preferred example of an embodiment of the invention, illustrated by the figures, on the other hand. In connection with the explanation of the preferred example of an embodiment of the invention with the aid of the figures, generally preferred embodiments and further developments of the teaching will be explained.

In the drawings:

FIG. 1 is a schematic view of an embodiment of a method according to the invention, showing the key distribution, and

FIG. 2 is a schematic view of an embodiment of a method according to the invention showing the data aggregation.

FIG. 1 illustrates schematically an embodiment of the present invention. More precisely, the key distribution process according to the invention is shown. For the purpose of clarity and exemplary illustration, FIG. 1 only shows a very small section of a sensor network 1, the illustrated section comprising only four sensor nodes which are denominated N₀, N₁, N₂, and N₃. In practice, however, the sensor network 1 may include several thousands of sensor nodes N_(i).

The initialisation works top-down from the root of the sensor network 1, i.e. the sink node S (not shown), to the sensor nodes N_(i). In a nutshell, every sensor node N_(i) receives at one point in time during the key distribution its share of the master key K, which is chosen by the sink node and which is distributed to the sink node's successor nodes in a controlled way. In the embodiment shown in FIG. 1, sensor node N₀ receives its share k of the master key K from its predecessor node (not shown). The sensor node N₀ is aware of its successor nodes N₁, N₂, and N₃ and, consequently, splits up its share k of the master key K into four subtotals k₀, k₁, k₂ and k₃ with k=k₀+k₁+k₂+k₃ and distributes the last three subtotals to its successor nodes N₁, N₂, and N₃, respectively. The sensor nodes N₁ and N₂ have two successor nodes each and, consequently, split up their shares k₁ and k₂, respectively, of the master key K randomly into three subtotals and distribute these values to their successor nodes. On the other hand, the sensor node N₃ has three successor nodes and, consequently, splits up its share k₃ of the master key K into four subtotals.
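The top-down distribution over the section of FIG. 1 may be sketched as a recursive procedure. The sketch assumes integer keys added modulo a public modulus; the successor nodes of N₁, N₂ and N₃ are not named in the figure description, so the labels used for them here, like the function name distribute, are hypothetical.

```python
import secrets

MODULUS = 2**32  # assumed public modulus

# Successor nodes per node; the names of the leaf nodes are placeholders.
TREE = {
    "N0": ["N1", "N2", "N3"],
    "N1": ["N1a", "N1b"],
    "N2": ["N2a", "N2b"],
    "N3": ["N3a", "N3b", "N3c"],
}


def distribute(node, share, tree, keys, m=MODULUS):
    """Split `share` at `node`, keep one part as the node's key and pass
    the remaining parts down to the successor nodes."""
    successors = tree.get(node, [])
    subtotals = [secrets.randbelow(m) for _ in successors]
    # The node keeps the remainder as its own key and forgets the subtotals.
    keys[node] = (share - sum(subtotals)) % m
    for successor, subtotal in zip(successors, subtotals):
        distribute(successor, subtotal, tree, keys, m)
    return keys


k = secrets.randbelow(MODULUS)       # share received by N0 from its predecessor
keys = distribute("N0", k, TREE, {})
```

A leaf node has no successor nodes and therefore keeps the received subtotal unchanged as its key; the keys of all eleven nodes in this section add up to the share k received by N₀.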

At the end of the distribution process, each sensor node N_(i) has received a single symmetric key k_(i), wherein all keys k_(i) are derived from one master key K which is solely stored at the sink node. During the initialisation phase as described above the system is highly vulnerable even to passive attacks. Hence, no attacker is assumed to be in place during this phase.

Moreover, although not shown in FIG. 1, the sink node S chooses a random value R. The value R is distributed to all sensor nodes N_(i) of the network 1. For robustness, each sensor node N_(i) stores for each of its successor nodes the cipher E_(kj)(R), wherein k_(j) are the respective keys of the successor nodes. For security reasons, each sensor node N_(i) subsequently deletes all values from its memory except its own key k_(i) and the encrypted values E_(kj)(R) for its successor nodes.

Referring now to FIG. 2, the same small section of the sensor network 1 as in FIG. 1 is shown. In contrast to FIG. 1, FIG. 2 refers to the data aggregation process. During an aggregation phase converge-cast traffic is encrypted end-to-end from the sensor nodes N_(i) to the sink node. Each sensor node N_(i) applies a symmetric homomorphic encryption transformation by encrypting its monitored value with its own unique key and by subsequently adding the resulting ciphertext to the ciphertexts received from its children/successor nodes.

Referring more particularly to FIG. 2, the sensor nodes N₁ and N₃ have sensed values v₁ and v₃, respectively. On the other hand, sensor node N₂ is exhausted, for example due to empty batteries, and, as a result, does not contribute a measured value to the data aggregation process. The sensor nodes N₁ and N₃ encrypt their measured values with their individual symmetric keys k₁ and k₃, respectively, and forward the results E_(k1)(v₁) and E_(k3)(v₃), respectively, to their predecessor node N₀.

As the sensor nodes are aware of their relative position within the network, sensor node N₀ realizes that it has not received the data from all its successor nodes N₁, N₂ and N₃. Consequently, it replaces the missing ciphers by the ciphers of a replacement value R which has been distributed during the initialisation phase as a network wide dummy value and which is stored by all sensor nodes N_(i) as described above. As a result, aggregator node N₀ forwards E_(k1)(v₁)+E_(k2)(R)+E_(k3)(v₃) to its predecessor node. At the end, the sink node S receives

E_(Σki)(Σv_(i))+E_(Σkj)(ctr*R)

as well as the value ctr wherein ctr is a first counter that denotes the number of dummy values R contained in the aggregation. As the sink node S furthermore knows K, it can easily derive the sum of all measured and reported values. With respect to the non-responding sensor nodes the sink node S at the end of the aggregation process only needs to know how many dummy values are included in the plaintext but not which specific nodes have responded. The counter ctr is incremented according to the number of non-responding nodes and may be transmitted together with the encrypted measured values.

From the information as described above, the sink node S can recover the sum of the measured and reported values but does not know how many values have been aggregated in total. Since this information may be important, a second counter, resp, may be incorporated which counts the number of responding nodes. The counter resp can be handled like the counter ctr. An aggregator node receives from each responding node N_(i) a value resp_(i) and forwards their sum to its predecessor node. In the embodiment shown in FIG. 2 the sensor node N₀ would forward the counter ctr incremented by the value 1 (resulting from the non-responding node N₂) and the counter resp incremented by the value 2 (resulting from the two responding nodes N₁ and N₃).
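The situation of FIG. 2, including both counters and the recovery of the sum at the sink node, may be sketched as follows. The sketch assumes the additive example cipher E_(k)(v)=(k+v) mod n and, for simplicity, treats k₁, k₂ and k₃ as the complete key set so that K=k₁+k₂+k₃; all numeric values and the function names are hypothetical.

```python
MODULUS = 1000  # assumed modulus of the additive example cipher
R = 7           # network-wide dummy value (illustrative)


def encrypt(k, v, m=MODULUS):
    return (k + v) % m


def recover_sum(aggregate, master_key, ctr, dummy, m=MODULUS):
    """Sink side: remove the master key and ctr copies of the dummy value."""
    return (aggregate - master_key - ctr * dummy) % m


k1, k2, k3 = 100, 200, 300
master_key = (k1 + k2 + k3) % MODULUS    # known to the sink node only

v1, v3 = 11, 13                          # N2 is exhausted and reports nothing

# Aggregator N0 adds the received ciphers and the stored default for N2.
aggregate = (encrypt(k1, v1) + encrypt(k2, R) + encrypt(k3, v3)) % MODULUS
ctr = 1     # one dummy value contained in the aggregation (node N2)
resp = 2    # two responding nodes (N1 and N3)

total = recover_sum(aggregate, master_key, ctr, R)   # 24 == 11 + 13
average = total / resp                               # needs the counter resp
```

The counters are sent in the clear in this sketch; they reveal how many nodes responded but not which ones.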

Since each sensor node N_(i) stores only its own key, it cannot decrypt the incoming ciphers from its children/successor nodes. Only the sink node S is enabled to decrypt the final aggregated value by applying the master key K to the received ciphers. As previously mentioned, since not all nodes may always have contributed, or since packets may get lost due to interference on the wireless transmission medium, each intermediate node adds to the aggregated ciphered sum those stored default ciphers E_(ki)(R) which correspond to its direct children which have not provided their input. During an aggregation phase the system is secure against passive and active attacks. No sensor node can decrypt received ciphers. Only the corruption of a full subtree ST(N) would result in gaining knowledge of the aggregated value representing the monitored value of the actual subtree. The subtree ST of a node N in this context is defined as ST(N)={N} if N is a leaf node, and otherwise as ST(N)={N} ∪ ⋃_(N′∈Succ(N)) ST(N′), i.e. as the subtree with root N.
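The recursive definition of the subtree ST(N) may be written out as follows. The dictionary representation of the successor relation Succ(N) and the node labels are assumptions made for illustration only.

```python
def subtree(node, successors):
    """Return ST(node): the node itself plus the subtrees of all its
    successor nodes. A leaf node has no entry and yields {node}."""
    result = {node}
    for child in successors.get(node, []):
        result |= subtree(child, successors)
    return result


# Successor relation Succ(N) for a small example tree (labels hypothetical).
SUCC = {
    "N0": ["N1", "N2", "N3"],
    "N1": ["N1a", "N1b"],
}

st_n1 = subtree("N1", SUCC)   # {"N1", "N1a", "N1b"}
st_n2 = subtree("N2", SUCC)   # {"N2"}, since N2 is a leaf node here
```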

Many modifications and other embodiments of the invention set forth herein will come to the mind of one skilled in the art to which the invention pertains having the benefit of the teachings presented in the foregoing description and the associated drawings. Therefore, it is to be understood that the invention is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

1-19. (canceled)
 20. A method for aggregating data in a network, particularly in a wireless sensor network, wherein the network (1) comprises a plurality of sensor nodes (N_(i)) to measure data and at least one sink node (S) at which the data measured by the sensor nodes (N_(i)) are aggregated, and wherein each sensor node (N_(i)) encrypts its measured data with a key k_(i) and forwards the result towards the sink node (S), characterized in that, in the context of a key distribution within the network (1), a master key K is chosen, and that the master key K is autonomously split up by the network (1) into individual keys k_(i) to be used by the sensor nodes (N_(i)) for encrypting measured data, with the sum of all individual keys k_(i) being equal to the master key K.
 21. The method according to claim 20, wherein the master key K is chosen by the sink node (S).
 22. The method according to claim 20, wherein the master key K is solely stored at the sink node (S).
 23. The method according to claim 20, wherein the flow of data in the network (1) has a converge-cast traffic structure.
 24. The method according to claim 20, wherein the sensor nodes (N_(i)), with the exception of the sink node (S), have one predecessor node (F_(Ni)), respectively, and wherein the sensor nodes (N_(i)), as the case may be, have one or more successor nodes (C_(Ni)), respectively.
 25. The method according to claim 20, wherein means are provided by which each sensor node (N_(i)) is informed of its relative position within the network.
 26. The method according to claim 24, wherein each sensor node (N_(i)), in the context of the key distribution process, upon receipt of a certain subtotal K_(i) of the master key K from its predecessor node (F_(Ni)) splits up the value K_(i) into a number n of subtotals K′_(i), wherein each sensor node (N_(i)) may define the number n of subtotals K′_(i) in accordance with the number of its direct successor nodes.
 27. The method according to claim 24, wherein at least some of the sensor nodes (N_(i)) function as aggregator nodes (A_(i)) by aggregating the measured data of their successor nodes (C_(Ai)).
 28. The method according to claim 27, wherein an aggregator node (A_(i)) performs an aggregation function on incoming encrypted data, forwarded to the aggregator node (A_(i)) by its successor nodes (C_(Ai)), and on its own measured encrypted data before forwarding the result towards the sink node (S), wherein the aggregation function may be an addition operation, a subtraction operation, a multiplication operation or an inverse multiplication operation.
 29. The method according to claim 20, wherein the algorithm used for encryption is homomorphic both with respect to the keys k_(i) used for encryption and with respect to the values v_(i) to be encrypted, wherein the homomorphic algorithm may be a deterministic or probabilistic privacy homomorphism.
 30. The method according to claim 24, wherein the sink node (S) chooses a random value R as some network wide dummy value and wherein each sensor node (N_(i)) stores the random value R encrypted with the keys k_(i) of its successor nodes (C_(Ni)) as replacement values E_(ki) (R).
 31. The method according to claim 28, wherein each aggregator node (A_(i)), in the context of applying the aggregation function, employs the replacement values E_(ki) (R) for those of its successor nodes (C_(Ai)) which do not contribute to the data aggregation process by forwarding a measured value to the aggregator node (A_(i)).
 32. The method according to claim 20, wherein the key distribution is carried out in a secure environment.
 33. The method according to claim 20, wherein each sensor node's (N_(i)) key k_(i) is updated from time to time during a key refreshment phase, wherein key refreshment may be carried out after each single aggregation phase.
 34. The method according to claim 20, wherein the roles of the sensor nodes (N_(i)) are changed from time to time.
 35. The method according to claim 21, wherein the master key K is solely stored at the sink node (S).
 36. The method according to claim 25, wherein each sensor node (N_(i)), in the context of the key distribution process, upon receipt of a certain subtotal K_(i) of the master key K from its predecessor node (F_(Ni)) splits up the value K_(i) into a number n of subtotals K′_(i), wherein each sensor node (N_(i)) may define the number n of subtotals K′_(i) in accordance with the number of its direct successor nodes.